Abstract

Consider the random set system $\mathcal{A} = R(n,p)$ of $[n] := \{1, 2, \ldots, n\}$,
where $\mathcal{A} = \{A_j : A_j \in \mathcal{P}([n]) \text{ and } A_j \text{ is selected with probability } p = p_n\}$.
A set $H \subseteq [n]$ is said to be a hitting set for $\mathcal{A}$ if $|A_j \cap H| \ge 1$ for all $A_j \in \mathcal{A}$.
The second moment method is used to exhibit the sharp concentration
of the minimal size of $H$ for a variety of values of $p$.

1 Introduction and Motivation 

A set $D$ of vertices in a graph $G = (V, E)$ forms a dominating set of $G$ if each
$v \in V$ is either in $D$ or adjacent to some $d \in D$. The domination number
$\gamma = \gamma(G)$ is the size of the smallest dominating set of $G$. Given a graph on
$n$ vertices of minimum degree $\delta$, it is proved, e.g., in Alon and Spencer [1] that

$$\gamma(G) \le n\,\frac{1 + \ln(\delta + 1)}{\delta + 1}. \qquad (1)$$



In a result of direct relevance to this paper, Weber [12] proved in 1981 that
the domination number of the random graph $G(n,p)$ is sharply concentrated
w.h.p. if $p$ is fixed. This result was extended in [TB] to the case $p = p_n \to 0$,
where a two point concentration was shown to hold for $\gamma(G(n,p_n))$ provided
$p_n$ did not decay too rapidly; specifically, $p = 1/\log\log n$ works in the above
result; $p = 1/\log n$ does not.

Given a $k$-uniform hypergraph $H = (V, E_k)$, a transversal is a collection
$T$ of vertices such that each edge $e \in E_k$ intersects $T$ in at least one vertex.
We will denote the transversal number of $H$ by $\tau(H)$. A transversal is also
called a hitting set, particularly in the Computer Science literature, where it
is more typical than not for edges to be of different sizes. Accordingly, we
will reserve the terminology "transversal" for $k$-uniform hypergraphs, and
"hitting set" for the general case. In a result that echoes (1), Alon [2] proved
that for a $k$-uniform hypergraph with $v$ vertices and $e$ edges,

$$\tau(H) \le (1 + o(1))\,\frac{\ln k}{k}\,(v + e) \qquad (k \to \infty).$$

The Computer Science literature has focused more on complexity issues for 
hitting sets; see, e.g., [7], [5], [9], and [Hj. The connection between total 
domination and transversals has been explored in [XT] . 

If all our edges are of cardinality two, i.e. if we have a graph, let $s \notin T$,
$T$ a transversal. Then the only edges containing $s$ must be between $s$ and
$t$ for $t \in T$. Thus $T$ is a minimal hitting set iff $T^c$ is maximal independent.
Note that this is also true for arbitrary hypergraphs if independent sets are
defined as collections of vertices for which there is no edge that is a subset
of these vertices. Also, the sharp two point concentration of the size of the maximal
independent set in a random graph has been well understood since the early
work of Bollobás and Erdős [6], Matula [10], and others. In these results
on finite point concentration, nothing more than the second moment method
was used, though more sophisticated machinery was employed by Alon and
Krivelevich [3], and Achlioptas and Naor [TJ to show the sharp concentration
of the chromatic number of $G(n, p)$. It will turn out that elementary methods
will suffice in this paper; we will investigate the sharp concentration of the
size of minimal hitting sets (or hitting number) for non-uniform hypergraphs.

Our model consists of picking each set $A \subseteq \{1, 2, \ldots, n\}$ with probability
$p = p_n$. Let $\mathcal{A}$ be the ensemble of picked sets, which we will call a random set
system and denote by $R(n,p)$ (to mirror the $G(n,p)$ notation for a random
graph). The goal is to discover a class of $p$'s for which the hitting number is
close to the intuitive guess of $\lg(p \cdot 2^n)$, where throughout this paper $\lg = \log_2$.
In Section 2, we set the stage for when a one or two point concentration holds
for the hitting number, and, in Sections 3 and 4, details are provided for two
canonical cases, namely those corresponding to $p = 1/2^{n\beta}$ and $p = n^\alpha/2^n$.
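The model can be simulated directly for small $n$. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, subsets of $\{0, \ldots, n-1\}$ are encoded as bitmasks, the empty set is skipped (it cannot be hit), and the hitting number is found by brute force, which is only feasible for small ground sets.

```python
import itertools
import random


def sample_set_system(n, p, rng):
    """Pick each nonempty subset of {0, ..., n-1} independently with probability p.

    Subsets are encoded as bitmasks; the empty set is skipped because no set can hit it.
    """
    return [mask for mask in range(1, 1 << n) if rng.random() < p]


def hitting_number(n, system):
    """Smallest size of a set H meeting every member of `system` (brute force)."""
    for size in range(n + 1):
        for combo in itertools.combinations(range(n), size):
            h = sum(1 << v for v in combo)
            # h hits the system iff it shares at least one element with every picked set
            if all(h & a for a in system):
                return size
    raise AssertionError("unreachable: the full ground set hits every nonempty set")


if __name__ == "__main__":
    rng = random.Random(0)
    n = 10
    p = n ** 2 / 2 ** n  # sparse-type choice p = n^alpha / 2^n with alpha = 2 (an assumption)
    sizes = [hitting_number(n, sample_set_system(n, p, rng)) for _ in range(20)]
    print(sorted(set(sizes)))  # the distinct hitting numbers observed over 20 samples
```

For such small $n$ the asymptotic formulas of the later sections are not expected to be accurate; the sketch only illustrates the definitions of $R(n,p)$ and of the hitting number.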



2 Setting up the Two-Point Concentration 

Define the baseline random variable, $X_m$, to be the number of hitting sets of
size $m$. We start by exploring a lower bound on $|H|$. Clearly

$$E(X_m) = \binom{n}{m}(1-p)^{2^{n-m}} \le \binom{n}{m}\exp\{-p\,2^{n-m}\},$$

since a set of size $m$ is hitting iff we do not pick any of the subsets of its
complement to be in the random set system (actually we cannot by definition
hit the empty set, so the correct exponent ought to be $2^{n-m} - 1$). Let us set
(with hindsight) $m = \lg(p \cdot 2^n) - \varphi(n)$.¹ Thus

$$P(X_m \ge 1) \le E(X_m) \le \binom{n}{m}\exp\{-2^{\varphi(n)}\} \to 0 \qquad (2)$$



provided that $\binom{n}{m} \ll \exp\{2^{\varphi(n)}\}$, and, using the inequality
$1 - p \ge e^{-p/(1-p)} = \exp\{-p\,2^{\varepsilon_n}\}$ with $\varepsilon_n = \lg (1-p)^{-1} = O(p)$,

$$E(X_m) = \binom{n}{m}(1-p)^{2^{n-m}} \ge \binom{n}{m}\exp\{-p\,2^{n-m+\varepsilon_n}\} \to \infty \qquad (3)$$

if $\binom{n}{m} \gg \exp\{2^{\varphi(n)+\varepsilon_n}\}$, where the $\varphi$ functions in (2) and (3) are different.
Since zero-one probability thresholds often occur precisely where the associated
expected value transitions from zero to infinity, we anticipate that
Equations (2) and (3) occur with near-consecutive values of $m$.
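The first moment formula can be checked exactly on a tiny ground set. The sketch below is illustrative only (the function names are hypothetical): it uses the corrected exponent $2^{n-m} - 1$ mentioned above, since only nonempty sets are sampled, and compares the closed form with an exhaustive average over all $2^{2^n - 1}$ possible set systems, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb


def expected_hitting_sets_formula(n, m, p):
    """C(n, m) * (1 - p)^(2^(n-m) - 1): no nonempty subset of the complement is picked."""
    return comb(n, m) * (1 - p) ** (2 ** (n - m) - 1)


def expected_hitting_sets_enumerated(n, m, p):
    """Average the number of hitting m-sets over every possible set system on n points."""
    subsets = list(range(1, 1 << n))  # nonempty subsets as bitmasks
    m_sets = [sum(1 << v for v in c) for c in combinations(range(n), m)]
    total = Fraction(0)
    for picks in product([False, True], repeat=len(subsets)):
        system = [a for a, picked in zip(subsets, picks) if picked]
        prob = Fraction(1)
        for picked in picks:
            prob *= p if picked else 1 - p
        # count the m-sets that intersect every picked set
        count = sum(all(h & a for a in system) for h in m_sets)
        total += prob * count
    return total


if __name__ == "__main__":
    p = Fraction(1, 3)
    for m in range(4):
        print(m, expected_hitting_sets_formula(3, m, p),
              expected_hitting_sets_enumerated(3, m, p))
```

The enumeration is doubly exponential in $n$, so it is meant only as a sanity check for $n = 3$ or so.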

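By Chebyshev's inequality, $P(X_m = 0) \le \mathrm{Var}(X_m)/E^2(X_m)$, so to establish an
upper bound on $|H|$ it would suffice to show that the variance is an order of
magnitude smaller than the square of the mean whenever $m \ge m_0$, for some
$m_0$ to be determined. Since $X_m = \sum_{j=1}^{\binom{n}{m}} I_j$, where the indicator variable $I_j$
equals one iff the $j$th $m$-set hits $R(n,p)$, we have that

$$E(X_m^2) - E^2(X_m) = \sum_{j=1}^{\binom{n}{m}} E(I_j^2) + \sum_{i \ne j} E(I_i I_j) - E^2(X_m) = E(X_m) - E^2(X_m) + \sum_{i \ne j} E(I_i I_j),$$

so that

$$\frac{\mathrm{Var}(X_m)}{\lambda^2} = \frac{1}{\lambda} - 1 + \frac{1}{\lambda^2}\sum_{i \ne j} E(I_i I_j), \qquad (4)$$

where $\lambda = \lambda_m = E(X_m)$. Now two sets $A, B$ of size $m$ that intersect in $r$
elements both hit $\mathcal{A}$ iff we do not pick, as part of $\mathcal{A}$, any set that is a subset
of $A^c$ or a subset of $B^c$; there are $2^{n-m} + 2^{n-m} - 2^{n-2m+r}$ of these. Thus,
substituting $s = m - r$ and assuming that $\lambda \ge 1$, we have

$$\begin{aligned}
\frac{1}{\lambda^2}\sum_{i \ne j} E(I_i I_j)
&= \frac{1}{\lambda^2}\binom{n}{m}\sum_{r=0}^{m-1}\binom{m}{r}\binom{n-m}{m-r}(1-p)^{2^{n-m+1} - 2^{n-2m+r}} \\
&= \sum_{s=1}^{m}\frac{\binom{m}{s}\binom{n-m}{s}}{\binom{n}{m}}\,(1-p)^{-2^{n-m-s}} \\
&= \sum_{s=1}^{m}\frac{\binom{m}{s}\binom{n-m}{s}}{\binom{n}{m}}\left(\frac{\binom{n}{m}}{\lambda}\right)^{2^{-s}} \\
&\le \sum_{s=1}^{m}\binom{m}{s}\binom{n-m}{s}\binom{n}{m}^{2^{-s}-1}. \qquad (5)
\end{aligned}$$

¹ In this paper we will encounter several functions that play a "generic" role. Examples
of these functions are $\omega(n)$, $\varphi(n)$, $\varepsilon_n$, and $\mu_n$. They are each defined differently in various
parts of the paper, but their role is always the same, e.g. $\varphi(n)$ will always denote how
much smaller the hitting set size is than $\lg(p \cdot 2^n)$ and $\omega(n)$ will always be a function that
tends to infinity at an arbitrarily slow rate.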
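The counting step used for the joint probability, namely that $2^{n-m} + 2^{n-m} - 2^{n-2m+r}$ subsets of $[n]$ lie inside $A^c$ or inside $B^c$ when $|A| = |B| = m$ and $|A \cap B| = r$, can be confirmed by brute force. This is an illustrative sketch with hypothetical function names; the count includes the empty set, as in the text.

```python
from itertools import combinations


def count_forbidden_sets(n, a, b):
    """Number of subsets of range(n), empty set included, that avoid `a` or avoid `b`."""
    a_mask = sum(1 << v for v in a)
    b_mask = sum(1 << v for v in b)
    # a subset s lies inside the complement of a exactly when s & a_mask == 0
    return sum(1 for s in range(1 << n) if not (s & a_mask) or not (s & b_mask))


def forbidden_count_formula(n, m, r):
    """Inclusion-exclusion count 2^(n-m) + 2^(n-m) - 2^(n-2m+r)."""
    return 2 ** (n - m) + 2 ** (n - m) - 2 ** (n - 2 * m + r)


if __name__ == "__main__":
    n, m = 7, 3
    a = (0, 1, 2)
    for b in combinations(range(n), m):
        r = len(set(a) & set(b))
        assert count_forbidden_sets(n, a, b) == forbidden_count_formula(n, m, r)
    print("inclusion-exclusion count verified for n = 7, m = 3")
```

The exponent $n - 2m + r$ is never negative here, because $|A \cup B| = 2m - r \le n$.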



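By (4) and (5) it thus suffices to show that

$$\sum_{s \ge 1}\binom{m}{s}\binom{n-m}{s}\binom{n}{m}^{2^{-s}-1} = 1 + o(1) \qquad (6)$$

as $\lambda \to \infty$; this is really a simple statement about the function $m = m(n)$ as
$n \to \infty$. Let us set up what it takes to make (6) occur: We first define, with
$s_0 = 2\lg(m \log n)$, the sums

$$\Sigma_1 = \sum_{s \ge s_0}\binom{m}{s}\binom{n-m}{s}\binom{n}{m}^{2^{-s}-1}$$

and

$$\Sigma_2 = \sum_{1 \le s \le s_0 - 1}\binom{m}{s}\binom{n-m}{s}\binom{n}{m}^{2^{-s}-1}.$$

In $\Sigma_1$, we first bound as follows:

$$\binom{n}{m}^{2^{-s}} \le \binom{n}{m}^{2^{-s_0}} \le \left(\frac{ne}{m}\right)^{m/2^{s_0}} = \left(\frac{ne}{m}\right)^{1/(m\log^2 n)} = 1 + o(1),$$

so that

$$\Sigma_1 \le (1 + o(1))\sum_{s \ge s_0}\frac{\binom{m}{s}\binom{n-m}{s}}{\binom{n}{m}} = 1 + o(1),$$

since the sum above represents almost entirely the mass of a hypergeometric
variable with mean $\sim m$, provided that $s_0 \ll m$, which holds if $m \ge \Omega(\log\log n)$.
Turning to $\Sigma_2$, we have

$$\begin{aligned}
\Sigma_2 &\le \sum_{1 \le s \le s_0 - 1}\binom{m}{s}\binom{n-m}{s}\binom{n}{m}^{-1/2} \\
&\le \sum_{1 \le s \le s_0 - 1} n^{2s}\,(n/m)^{-m/2} \\
&\le n^{2s_0 - m/2}\, m^{m/2} \\
&= e^{(2s_0 - m/2)\log n + (m/2)\log m} \qquad (7)
\end{aligned}$$

(where we used the bounds $\max\{\binom{m}{s}, \binom{n-m}{s}\} \le n^s$ and $\binom{n}{m} \ge (n/m)^m$ in the
second line of the display above). We wish the estimate in (7) to be of magnitude $o(1)$
and thus need

$$\log m < \left(1 - \frac{8\lg(m\log n)}{m}\right)\log n. \qquad (8)$$

It is not too hard to check that (8) holds if $m$ is not too small or too large;
specifically one needs

$$\Omega(\log\log n) \le m \le n - \Omega(\log n). \qquad (9)$$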
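The sum in (6) is easy to evaluate numerically, which gives a feel for when it is close to 1. The sketch below is only an illustration with a hypothetical function name; it works with logarithms of exact binomial coefficients to avoid overflow, and the sample parameters are assumptions chosen for the demonstration.

```python
from math import comb, exp, log


def second_moment_sum(n, m):
    """Sum over s = 1..m of C(m,s) * C(n-m,s) * C(n,m)^(2^-s - 1), computed via logarithms."""
    log_total = log(comb(n, m))
    total = 0.0
    for s in range(1, m + 1):
        if s > n - m:
            break  # C(n-m, s) = 0: two m-sets cannot differ in more than n-m elements
        log_term = (log(comb(m, s)) + log(comb(n - m, s))
                    + (2.0 ** (-s) - 1.0) * log_total)
        total += exp(log_term)
    return total


if __name__ == "__main__":
    print(second_moment_sum(10 ** 6, 40))  # m comfortably inside the range (9)
    print(second_moment_sum(20, 2))        # m too small: the sum is far from 1
```

With $n = 10^6$ and $m = 40$ almost all of the mass sits at $s = m$, where the factor $\binom{n}{m}^{2^{-s}}$ is essentially 1, so the sum is very close to 1; for very small $m$ the terms with small $s$ dominate and the sum is much larger.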

So, for $m$ satisfying (9), we get that the hitting size is at least $m + 1$ if
$\lambda = \lambda_m \to 0$, while if $\lambda = \lambda_m \to \infty$, then the hitting size is at most $m$. We
next note that

$$\frac{\lambda_{m+1}^2}{\lambda_m} = \binom{n}{m+1}^{2}\binom{n}{m}^{-1} \to \infty,$$

certainly for all $m \in [1, n-3]$. This leads to the conclusion that either $\lambda_m \to 0$
or $\lambda_{m+1} \to \infty$. If both these hold, then $|H| = m + 1$ w.h.p.; on the other
hand if $\lambda_{m-1} \to 0$; $\lambda_m \to K$; $\lambda_{m+1} \to \infty$, or $\lambda_m \to 0$; $\lambda_{m+1} \to K$; $\lambda_{m+2} \to \infty$
for some $K \in \mathbb{R}^+$, then we have a two point concentration. We summarize
the findings of this section in the following result:

Theorem 1. Consider the random set system $\mathcal{A} = R(n,p)$, where $p$ is unspecified.
Let $T$ denote the interval $[\Omega(\log\log n),\, n - \Omega(\log n)]$, where the
constants in the $\Omega$ functions can be readily specified. Let $\ell = \sup\{m = m_n :
\lim E(X_m) = 0\}$ and $h = \inf\{m : \lim E(X_m) = \infty\}$. Then, for suitable
$p = p_n$: $\ell, h \in T$; $h - \ell \in \{1, 2\}$; and $|H| = \ell + 1$ or $|H| = h$ w.h.p.
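The relation between consecutive first moments that drives Theorem 1, namely $\lambda_{m+1}^2/\lambda_m = \binom{n}{m+1}^2/\binom{n}{m}$ when $\lambda_m = \binom{n}{m}(1-p)^{2^{n-m}}$, holds exactly because the powers of $1 - p$ cancel. A small check in exact rational arithmetic (hypothetical function names, small illustrative parameters):

```python
from fractions import Fraction
from math import comb


def expected_count(n, m, p):
    """lambda_m = C(n, m) * (1 - p)^(2^(n-m)), as in the first moment formula."""
    return comb(n, m) * (1 - p) ** (2 ** (n - m))


def ratio_identity_holds(n, m, p):
    """Check lambda_{m+1}^2 / lambda_m == C(n, m+1)^2 / C(n, m) exactly."""
    lhs = expected_count(n, m + 1, p) ** 2 / expected_count(n, m, p)
    rhs = Fraction(comb(n, m + 1) ** 2, comb(n, m))
    return lhs == rhs


if __name__ == "__main__":
    n, p = 8, Fraction(1, 7)
    print(all(ratio_identity_holds(n, m, p) for m in range(1, n - 2)))
    # At m = n - 3 the ratio equals 3n(n-1) / (2(n-2)), which grows linearly in n,
    # while at m = n - 2 it equals 2n / (n-1), which stays bounded.
    print(Fraction(comb(n, n - 2) ** 2, comb(n, n - 3)), Fraction(comb(n, n - 1) ** 2, comb(n, n - 2)))
```

The bounded value at $m = n - 2$ is consistent with the restriction to $m \in [1, n-3]$.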

It remains to solve for $m$ in terms of $p$. In the next two sections, we
consider the "dense" case, where the hitting size is comparable to $n$, and
the "sparse" case, where we will seek to hit a system $\mathcal{A}$ of size satisfying
$|\mathcal{A}|^{1/n} \to 1$. For specificity we use the values $p = 1/2^{n\beta}$, $0 < \beta < 1$, and
$p = n^\alpha/2^n$, $\alpha > 0$, respectively, even though other choices could have been
made, with the analysis being quite similar. In both sections, we seek to
find a value of $m = m(p)$ for which $E(X_m) \to 0$ and $E(X_{m+1}) \to \infty$ (or
$E(X_{m+2}) \to \infty$).

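3 A Dense Case, $p = 1/2^{n\beta}$, $0 < \beta < 1$

With $p = 1/2^{n\beta}$ and $m = (1-\beta)n - \varphi(n)$, where we restrict $\varphi(n) \le \lg n$,

$$E(X_m) \le \binom{n}{m}\exp\{-2^{\varphi(n)}\} \le \binom{n}{(1-\beta)n}\,\gamma^{\lg n}\exp\{-2^{\varphi(n)}\}.$$

Stirling's formula next yields

$$E(X_m) \le \frac{C}{\sqrt{n}}\left(\beta^{-\beta}(1-\beta)^{-(1-\beta)}\right)^{n}\gamma^{\lg n}\exp\{-2^{\varphi(n)}\}
\le \frac{C}{\sqrt{n}}\,\gamma^{\lg n}\,\delta^{n}\exp\{-2^{\varphi(n)}\},$$

where $C$ is a universal constant, $\gamma = \max\{1, (1-\beta)/\beta\}$, and $\delta := \beta^{-\beta}(1-\beta)^{\beta-1} \le 2$.
We thus see that

$$P(X_m \ge 1) \le E(X_m) \to 0$$

if

$$2^{\varphi(n)} = n\ln\delta - \tfrac{1}{2}\ln n + (\ln\gamma)(\lg n) + \ln\omega(n) + \ln C,$$

or if

$$\varphi(n) = \lg\left(n\ln\delta - \tfrac{1}{2}\ln n + (\ln\gamma)(\lg n) + \ln\omega(n) + \ln C\right) = \lg(n\ln\delta) + o(1).$$

This yields

$$|H| \ge \lfloor (1-\beta)n - \lg(n\ln\delta) - o(1) \rfloor + 1, \qquad (10)$$

where in (10), $\varphi(n) \asymp \lg(n\ln\delta) \le \lg n$ as stipulated.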
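For the upper bound on $|H|$, we bound $E(X_m)$ from below as follows:

$$E(X_m) = \binom{n}{m}(1-p)^{2^{n-m}} \ge \binom{n}{m}\exp\{-p\,2^{n-m+\varepsilon_n}\},$$

where $\varepsilon_n = \lg(1-p)^{-1} = O(p)$. Setting $m = (1-\beta)n - \varphi(n)$ (we are in
search of a different $\varphi(n)$ than in (10)) yields

$$E(X_m) \ge \binom{n}{(1-\beta)n}\exp\{-2^{\varphi(n)+\varepsilon_n}\}$$

if $\beta < 1/2$, and

$$E(X_m) \ge \binom{n}{(1-\beta)n - \lg n}\exp\{-2^{\varphi(n)+\varepsilon_n}\}$$

if $\beta > 1/2$. Simplifying as before we get $E(X_m) \to \infty$ if

$$\varphi(n) = \lg\left(n\ln\delta - \tfrac{1}{2}\ln n + (\ln\eta)(\lg n) - \ln\omega(n)\right) - \varepsilon_n = \lg(n\ln\delta) - o^*(1),$$

where $\eta = \min\{(1-\beta)/\beta, 1\}$, and thus

$$|H| \le \lceil (1-\beta)n - \lg(n\ln\delta) + o^*(1) \rceil. \qquad (11)$$

It is easy to verify that the $\varphi(n)$ functions in (10) and (11) differ by $o(1)$.
Thus the worst case scenario is when these quantities straddle an integer,
when we have a two point concentration. In the other case, we have that $|H|$
is a constant w.h.p.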
We have proved 
Theorem 2. Let $H = H(n, \beta)$ be a minimal hitting set of the
random set system $R(n, 1/2^{n\beta})$, the ensemble that is generated
when each set in $\mathcal{P}([n])$ is independently picked with probability $p = 2^{-n\beta}$.
Then with probability approaching unity, $|H| = h$ or $h + 1$, where

$$h = \lfloor (1-\beta)n - \lg(n\ln\delta) - o(1) \rfloor + 1,$$

and where the $o(1)$ is as in the argument leading to (10).

The next result follows immediately: 

Corollary 3. Let $I = I(n, \beta)$ be a maximal independent set of
the random set system $R(n, 1/2^{n\beta})$. Then w.h.p., $|I| = i$ or $i + 1$, where

$$i = n - 2 - \lfloor (1-\beta)n - \lg(n\ln\delta) - o(1) \rfloor.$$
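The dense-case prediction can be compared numerically with the point at which the first moment $E(X_m)$ crosses 1. The following is a rough sketch under stated assumptions: the function names are hypothetical, the $o(1)$ term in $h$ is dropped, and $(1-p)^{N}$ is replaced by $\exp(-pN)$, which is a very accurate approximation when $p = 2^{-n\beta}$ is tiny.

```python
from math import comb, floor, log, log2


def dense_prediction(n, beta):
    """h = floor((1 - beta) n - lg(n ln delta)) + 1, with the o(1) term dropped."""
    delta = beta ** (-beta) * (1 - beta) ** (beta - 1)
    return floor((1 - beta) * n - log2(n * log(delta))) + 1


def first_moment_threshold(n, lg_expected_size):
    """Smallest m with ln E(X_m) > 0, using (1 - p)^N ~ exp(-p N).

    `lg_expected_size` is lg(p * 2^n), so that p * 2^(n-m) = 2^(lg_expected_size - m).
    """
    for m in range(n + 1):
        log_lambda = log(comb(n, m)) - 2.0 ** (lg_expected_size - m)
        if log_lambda > 0:
            return m
    return None


if __name__ == "__main__":
    for n, beta in [(200, 0.5), (400, 0.25)]:
        print(n, beta, dense_prediction(n, beta), first_moment_threshold(n, (1 - beta) * n))
```

For these two parameter choices the two quantities coincide; in general they are only expected to agree up to the lower-order terms that the $o(1)$ in Theorem 2 absorbs.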



4 A Sparse Case, $p = n^\alpha/2^n$, $\alpha > 0$

We now move on to the case $p = n^\alpha/2^n$, $\alpha > 0$. Notice that as in Theorem
2, the hitting number works out to be just a little smaller than the value
$\lg E|\mathcal{A}| = \lg(p \cdot 2^n)$, which can easily be seen to be the least $m$ such that the
set $\{1, 2, \ldots, m\}$ is expected to hit all the sets in $\mathcal{A}$.

Theorem 4. Let $H = H(n, \alpha)$ be a minimal hitting set of the
random set system $R(n, n^\alpha/2^n)$, $\alpha > 0$. Then with high probability, $|H| = h$
or $h + 1$, where

$$h = \lfloor \alpha\lg n - \lg(\alpha\lg n\ln n) - o(1) \rfloor + 1.$$
Proof. Let $X_m$ be as before. We have

$$E(X_m) \le \binom{n}{m}\exp\{-p\,2^{n-m}\} \le \left(\frac{ne}{m}\right)^{m}\exp\{-p\,2^{n-m}\},$$

so, setting $p = n^\alpha/2^n$ and $m = \alpha\lg n - \varphi(n)$ (where we restrict by seeking
solutions with $\varphi(n) \le 3\lg(\lg n)$), we get

$$E(X_m) \le \left(\frac{ne}{\alpha\lg n - 3\lg\lg n}\right)^{\alpha\lg n}\exp\{-2^{\varphi(n)}\} \to 0$$

if

$$2^{\varphi(n)} = \alpha\lg n\,\bigl(\ln n + 1 - \ln(\alpha\lg n - 3\lg\lg n)\bigr) + \ln\omega(n)$$

or

$$\varphi(n) = \lg\Bigl(\alpha\lg n\,\bigl(\ln n + 1 - \ln(\alpha\lg n - 3\lg\lg n)\bigr) + \ln\omega(n)\Bigr) = \lg(\alpha\lg n\ln n) + o(1). \qquad (12)$$

Note that $\varphi(n) \le 3\lg\lg n$ in (12) if $n$ is sufficiently large. Thus

$$|H| \ge \lfloor \alpha\lg n - \lg(\alpha\lg n\ln n) - o(1) \rfloor + 1. \qquad (13)$$

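Next, setting $\varphi(n) = \lg[\alpha(1-\mu_n)\lg n\ln n] - \varepsilon_n$, where $\mu_n = (5+\alpha)\lg\lg n/(\alpha\lg n)$ and
$m = \alpha\lg n - \varphi(n)$, we see that

$$\begin{aligned}
E(X_m) &= \binom{n}{m}(1-p)^{2^{n-m}} \\
&\ge \binom{n}{m}\exp\{-2^{\varphi(n)+\varepsilon_n}\} = \binom{n}{m}\frac{1}{n^{\alpha(1-\mu_n)\lg n}} \\
&\ge \frac{(n-m)^m}{C\sqrt{m}}\left(\frac{e}{m}\right)^{m}\frac{1}{n^{\alpha(1-\mu_n)\lg n}} \\
&\ge \exp\{-m^2/(n-m)\}\,\frac{1}{C\sqrt{m}}\left(\frac{ne}{m}\right)^{m}\frac{1}{n^{\alpha(1-\mu_n)\lg n}} \\
&\ge \frac{1}{2C\sqrt{m}}\left(\frac{ne}{\alpha\lg n}\right)^{m}\frac{1}{n^{\alpha(1-\mu_n)\lg n}} \\
&\ge \frac{1}{2C\sqrt{m}}\,\frac{n^{\alpha\mu_n\lg n - \varphi(n)}}{(\alpha\lg n)^{\alpha\lg n - \varphi(n)}} \\
&\ge \frac{1}{2C\sqrt{m}}\,\frac{n^{(5+\alpha)\lg\lg n - 3\lg\lg n}}{(\alpha\lg n)^{\alpha\lg n - \varphi(n)}} \\
&\ge n^{\lg\lg n} \to \infty.
\end{aligned}$$

Together with (13), this completes the proof of Theorem 4.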
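The $o(1)$ correction in (12) decays slowly, which matters when comparing Theorem 4 with computations at moderate $n$. The sketch below evaluates the difference between $\varphi(n)$ in (12) and its leading term $\lg(\alpha\lg n\ln n)$. It is an illustration only: the function names are hypothetical, everything is parametrized by $\lg n$ so that very large $n$ can be handled, and the choice $\omega(n) = \ln\ln n$ is an assumption (any slowly growing function would do).

```python
from math import log, log2


def phi_exact(lg_n, alpha, omega):
    """phi(n) from (12): lg(alpha lg n (ln n + 1 - ln(alpha lg n - 3 lg lg n)) + ln omega)."""
    ln_n = lg_n * log(2)
    a_lg_n = alpha * lg_n
    inner = a_lg_n * (ln_n + 1 - log(a_lg_n - 3 * log2(lg_n))) + log(omega)
    return log2(inner)


def phi_leading(lg_n, alpha):
    """Leading term lg(alpha lg n ln n)."""
    return log2(alpha * lg_n * lg_n * log(2))


def o1_term(lg_n, alpha=3.0):
    """Difference between phi(n) in (12) and its leading term."""
    omega = log(lg_n * log(2))  # assumed slowly growing omega(n) = ln ln n
    return phi_exact(lg_n, alpha, omega) - phi_leading(lg_n, alpha)


if __name__ == "__main__":
    for lg_n in (20, 100, 1000, 10 ** 4):
        print(lg_n, o1_term(lg_n))
```

With $\alpha = 3$ the correction is still about a third of a unit at $n = 2^{20}$ and only falls below $0.01$ for $n$ around $2^{10000}$, so at moderate $n$ the integer part in Theorem 4 can shift by one relative to the leading-order formula.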
5 Open Questions



We feel that deriving similar concentrations for hitting set size of random 
uniform hypergraphs would be of value, as would be results in which subsets 
of various sizes are picked with (a wide variety of) size-biased probabilities. 